3.
JMIR Cardio ; 8: e53421, 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38640472

ABSTRACT

BACKGROUND: Amyloidosis, a rare multisystem condition, often requires complex, multidisciplinary care. Its low prevalence underscores the importance of efforts to ensure the availability of high-quality patient education materials for better outcomes. ChatGPT (OpenAI) is a large language model powered by artificial intelligence that offers a potential avenue for disseminating accurate, reliable, and accessible educational resources for both patients and providers. Its user-friendly interface, engaging conversational responses, and the capability for users to ask follow-up questions make it a promising future tool in delivering accurate and tailored information to patients. OBJECTIVE: We performed a multidisciplinary assessment of the accuracy, reproducibility, and readability of ChatGPT in answering questions related to amyloidosis. METHODS: In total, 98 amyloidosis questions related to cardiology, gastroenterology, and neurology were curated from medical societies, institutions, and amyloidosis Facebook support groups and inputted into ChatGPT-3.5 and ChatGPT-4. Cardiology- and gastroenterology-related responses were independently graded by a board-certified cardiologist and gastroenterologist, respectively, who specialize in amyloidosis. These 2 reviewers (RG and DCK) also graded general questions for which disagreements were resolved with discussion. Neurology-related responses were graded by a board-certified neurologist (AAH) who specializes in amyloidosis. Reviewers used the following grading scale: (1) comprehensive, (2) correct but inadequate, (3) some correct and some incorrect, and (4) completely incorrect. Questions were stratified by categories for further analysis. Reproducibility was assessed by inputting each question twice into each model. The readability of ChatGPT-4 responses was also evaluated using the Textstat library in Python (Python Software Foundation) and the Textstat readability package in R software (R Foundation for Statistical Computing). RESULTS: ChatGPT-4 (n=98) provided 93 (95%) responses with accurate information, and 82 (84%) were comprehensive. ChatGPT-3.5 (n=83) provided 74 (89%) responses with accurate information, and 66 (79%) were comprehensive. When examined by question category, ChatGPT-4 and ChatGPT-3.5 provided 53 (95%) and 48 (86%) comprehensive responses, respectively, to "general questions" (n=56). When examined by subject, ChatGPT-4 and ChatGPT-3.5 performed best in response to cardiology questions (n=12), with both models producing 10 (83%) comprehensive responses. For gastroenterology (n=15), ChatGPT-4 received comprehensive grades for 9 (60%) responses, and ChatGPT-3.5 provided 8 (53%) responses. Overall, 96 of 98 (98%) responses for ChatGPT-4 and 73 of 83 (88%) for ChatGPT-3.5 were reproducible. The readability of ChatGPT-4's responses ranged from 10th to beyond graduate US grade levels with an average of 15.5 (SD 1.9). CONCLUSIONS: Large language models are a promising tool for providing accurate and reliable health information to patients living with amyloidosis. However, ChatGPT's responses exceeded the American Medical Association's recommended fifth- to sixth-grade reading level. Future studies focusing on improving response accuracy and readability are warranted. Prior to widespread implementation, the technology's limitations and ethical implications must be further explored to ensure patient safety and equitable use.
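For illustration, a minimal sketch of the kind of readability scoring this abstract describes, using the open-source Textstat Python package. The example question and response text below are placeholders rather than the study's data or pipeline.

```python
# Sketch: grade-level readability scoring of model responses with textstat.
import textstat

responses = {
    "What is amyloidosis?": "Amyloidosis is a condition in which abnormal proteins "
                            "build up in organs and tissues ...",
    # ... one entry per model response being evaluated
}

for question, answer in responses.items():
    # Flesch-Kincaid grade approximates the US school grade needed to read the text.
    fk_grade = textstat.flesch_kincaid_grade(answer)
    # text_standard aggregates several validated formulas into a consensus grade level.
    consensus = textstat.text_standard(answer, float_output=True)
    print(f"{question[:40]:40s}  FK grade: {fk_grade:5.1f}  consensus: {consensus:5.1f}")
```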

5.
Surg Endosc ; 38(5): 2522-2532, 2024 May.
Article in English | MEDLINE | ID: mdl-38472531

ABSTRACT

BACKGROUND: The readability of online bariatric surgery patient education materials (PEMs) often surpasses the recommended 6th grade level. Large language models (LLMs), like ChatGPT and Bard, have the potential to revolutionize PEM delivery. We aimed to evaluate the readability of PEMs produced by U.S. medical institutions compared to LLMs, as well as the ability of LLMs to simplify their responses. METHODS: Responses to frequently asked questions (FAQs) related to bariatric surgery were gathered from top-ranked health institutions. FAQ responses were also generated from GPT-3.5, GPT-4, and Bard. LLMs were then prompted to improve the readability of their initial responses. The readability of institutional responses, initial LLM responses, and simplified LLM responses was graded using validated readability formulas. Accuracy and comprehensiveness of initial and simplified LLM responses were also compared. RESULTS: Responses to 66 FAQs were included. All institutional and initial LLM responses had poor readability, with average reading levels ranging from 9th grade to college graduate. Simplified responses from LLMs had significantly improved readability, with reading levels ranging from 6th grade to college freshman. When comparing simplified LLM responses, GPT-4 responses demonstrated the highest readability, with reading levels ranging from 6th to 9th grade. Accuracy was similar between initial and simplified responses from all LLMs. Comprehensiveness was similar between initial and simplified responses from GPT-3.5 and GPT-4. However, 34.8% of Bard's simplified responses were graded as less comprehensive than its initial responses. CONCLUSION: Our study highlights the efficacy of LLMs in enhancing the readability of bariatric surgery PEMs. GPT-4 outperformed other models, generating simplified PEMs at 6th to 9th grade reading levels. Unlike GPT-3.5 and GPT-4, a portion of Bard's simplified responses were graded as less comprehensive. We advocate for future studies examining the potential role of LLMs as dynamic and personalized sources of PEMs for diverse patient populations of all literacy levels.
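A hedged sketch of the simplify-then-rescore workflow described above: the prompt wording, model name, and example text are illustrative assumptions, not the study's actual prompts or materials; only the general pattern (ask an LLM to simplify, then re-grade readability) follows the abstract.

```python
# Sketch: prompt an LLM to rewrite a PEM answer at a lower reading level, then rescore it.
import textstat
from openai import OpenAI  # assumes OPENAI_API_KEY is set in the environment

client = OpenAI()

def simplify(text: str, model: str = "gpt-4") -> str:
    """Ask the model to rewrite a patient-education answer at roughly a 6th-grade level."""
    resp = client.chat.completions.create(
        model=model,  # model name is an assumption for illustration
        messages=[
            {"role": "system", "content": "Rewrite the text at a 6th-grade reading level "
                                          "without changing its medical content."},
            {"role": "user", "content": text},
        ],
    )
    return resp.choices[0].message.content

initial = "Bariatric surgery encompasses several procedures that induce weight loss ..."
simplified = simplify(initial)
print("initial grade:   ", textstat.flesch_kincaid_grade(initial))
print("simplified grade:", textstat.flesch_kincaid_grade(simplified))
```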


Subject(s)
Bariatric Surgery , Comprehension , Patient Education as Topic , Humans , Patient Education as Topic/methods , Internet , Health Literacy , Language , United States
7.
Am Surg ; : 31348241230093, 2024 Feb 02.
Article in English | MEDLINE | ID: mdl-38305212

ABSTRACT

There are currently no studies examining differences in perceptions and expected impact of the Step 1 score change to pass/fail between surgical and non-surgical program directors (PDs). We conducted a systematic review in May 2023 of PubMed, Scopus, Web of Science, and PsycINFO to evaluate studies examining PDs' perspectives regarding the Step 1 score change. We performed random-effects meta-analyses to determine differences in perspectives among surgical and non-surgical PDs. Surgical PDs (76.8% [95% CI, 72.1%-82.0%], I2 = 52%) reported significantly greater rates of disagreement with the score change compared to non-surgical PDs (65.1% [95% CI, 57.9%-73.1%], I2 = 69.7%) (P = .01). Surgical PDs also reported significantly greater rates of agreement that the score change will increase the difficulty of objectively comparing applicants (88.1% [95% CI, 84.6%-91.7%], I2 = 16.4%), compared to non-surgical PDs (81.0% [95% CI, 75.6%-86.8%], I2 = 72.6%) (P = .04). There was less heterogeneity among non-surgical PDs (88.7% [95% CI, 86.2%-91.2%], I2 = 0%) than surgical PDs (84.7% [95% CI, 79.0%-90.8%], I2 = 67.3%) regarding expected increases in emphasis on Step 2, although the difference in rates of agreement was not statistically significant. Overall, most PDs reported disagreement with the score change, greater expected difficulty in objectively evaluating applicants, and a greater expected emphasis on Step 2, with surgical PDs reporting higher rates of disagreement, greater expected difficulty, and more heterogeneity regarding the expected increase in emphasis on Step 2 than non-surgical PDs. There is also significant heterogeneity in the literature regarding expected changes to the residency application review process. Further research is needed to establish evidence-based guidelines that improve the residency application process for all stakeholders.
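A minimal sketch of DerSimonian-Laird random-effects pooling of proportions with an I² heterogeneity estimate, the general class of analysis named above; the event counts are hypothetical placeholders, not the review's extracted data, and the original analysis may have used different software or estimators.

```python
# Sketch: pool study-level proportions on the logit scale with a DL random-effects model.
import numpy as np
from scipy.special import expit

def pool_proportions(events, totals, z=1.96):
    events, totals = np.asarray(events, float), np.asarray(totals, float)
    p = events / totals
    y = np.log(p / (1 - p))                    # logit-transformed proportions
    v = 1 / events + 1 / (totals - events)     # approximate logit variances
    w = 1 / v
    fixed = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - fixed) ** 2)           # Cochran's Q
    df = len(y) - 1
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - df) / c)              # between-study variance (DL estimator)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_re = 1 / (v + tau2)                      # random-effects weights
    mu = np.sum(w_re * y) / np.sum(w_re)
    se = np.sqrt(1 / np.sum(w_re))
    lo, hi = expit([mu - z * se, mu + z * se]) # back-transform CI to proportion scale
    return expit(mu), (lo, hi), i2

# Hypothetical "disagree with the change" counts from four surveyed PD samples.
est, (lo, hi), i2 = pool_proportions([40, 55, 62, 30], [50, 75, 80, 41])
print(f"pooled rate {est:.1%} (95% CI {lo:.1%}-{hi:.1%}), I² = {i2:.0f}%")
```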

8.
NPJ Digit Med ; 7(1): 22, 2024 Jan 26.
Article in English | MEDLINE | ID: mdl-38279034

ABSTRACT

The increasing need for mental health support and a shortage of therapists have led to the development of the eXtended-reality Artificial Intelligence Assistant (XAIA). This platform combines spatial computing, virtual reality (VR), and artificial intelligence (AI) to provide immersive mental health support. Utilizing GPT-4 for AI-driven therapy, XAIA engaged participants with mild-to-moderate anxiety or depression in biophilic VR environments. Speaking with an AI therapy avatar in VR was considered acceptable, helpful, and safe, with participants observed to engage genuinely with the program. However, some still favored human interaction and identified shortcomings with using a digital VR therapist. The study provides initial evidence of the acceptability and safety of AI psychotherapy via spatial computing, warranting further research on technical enhancements and clinical impact.

13.
Obes Surg ; 33(11): 3571-3601, 2023 11.
Article in English | MEDLINE | ID: mdl-37740831

ABSTRACT

Bariatric surgery remains underutilized despite its proven efficacy in the management of obesity. Provider perceptions of bariatric surgery are important to consider when discussing utilization rates. PubMed, SCOPUS, and OVID databases were searched in April 2023, and 40 published studies discussing providers' knowledge and perceptions of bariatric surgery were included. There were generally positive perceptions of the efficacy of bariatric surgery, although overestimations of surgical risks and postoperative complications were common. Providers' previous training was associated with knowledge and perceptions of bariatric surgery and familiarity with perioperative management across studies. These perceptions were also associated with referral rates, suggesting that inadequate provider knowledge may contribute to bariatric surgery underutilization. We advocate for increased bariatric surgery-related education throughout all stages of medical training and across specialties.


Subject(s)
Bariatric Surgery , Obesity, Morbid , Humans , Obesity, Morbid/surgery , Obesity/surgery , Postoperative Complications , Referral and Consultation
14.
Arab J Gastroenterol ; 24(3): 145-148, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37673708

ABSTRACT

BACKGROUND AND STUDY AIMS: Cirrhosis is a chronic, progressive disease that requires complex care. Its incidence is rising in Arab countries, and it was the 7th leading cause of death in the Arab League in 2010. ChatGPT is a large language model with a growing body of literature demonstrating its ability to answer clinical questions. We examined ChatGPT's accuracy in responding to cirrhosis-related questions in Arabic and compared its performance to English. MATERIALS AND METHODS: ChatGPT's responses to 91 questions in Arabic and English were graded by a transplant hepatologist fluent in both languages. Accuracy of responses was assessed using the following scale: (1) comprehensive, (2) correct but inadequate, (3) mixed with correct and incorrect/outdated data, and (4) completely incorrect. The accuracy of Arabic responses compared to English was assessed using the following scale: (1) Arabic response is more accurate, (2) similar accuracy, (3) Arabic response is less accurate. RESULTS: The model provided 22 (24.2%) comprehensive, 44 (48.4%) correct but inadequate, 13 (14.3%) mixed with correct and incorrect/outdated data, and 12 (13.2%) completely incorrect Arabic responses. When comparing the accuracy of Arabic and English responses, 9 (9.9%) of the Arabic responses were graded as more accurate, 52 (57.1%) as similar in accuracy, and 30 (33.0%) as less accurate than English. CONCLUSION: ChatGPT has the potential to serve as an adjunct source of information for Arabic-speaking patients with cirrhosis. The model provided correct responses in Arabic to 72.5% of questions, although its performance in Arabic was less accurate than in English. The model produced completely incorrect responses to 13.2% of questions, reinforcing its potential role as an adjunct to, and not a replacement for, care by licensed healthcare professionals. Future studies to refine this technology are needed to help Arabic-speaking patients with cirrhosis across the globe understand their disease and improve their outcomes.

15.
Pancreas ; 52(2): e115-e120, 2023 02 01.
Article in English | MEDLINE | ID: mdl-37523602

ABSTRACT

OBJECTIVES: The aim of this study was to assess the safety, feasibility, and reproducibility of endoscopic ultrasound shear wave elastography (EUS-SWE) in the pancreas. METHODS: This is a prospective registry of consecutive patients undergoing clinically indicated EUS. Ten readings of SWE velocities (Vs [distance/time, m/s]) were obtained in the head (HOP), body, and tail of the pancreas to quantify tissue stiffness. Each Vs score was accompanied by a reliability measurement, VsN (%), with VsN >50% considered reliable. Safety was evaluated by the perioperative complication rate. Feasibility was determined by the technical success of obtaining measurements. Reproducibility was evaluated using intraclass correlation coefficient analysis. RESULTS: A total of 3320 EUS-SWE measurements were performed on 117 patients without perioperative complications. The measurement success rate was 100% across all locations. Reliable measurements were more common in the HOP (953/1120 [85.1%]) followed by the body (853/1130 [75.5%]) and tail of the pancreas (687/1070 [64.2%]) (P < 0.001). The analysis showed good reproducibility in all locations (intraclass correlation coefficient range, 0.80-0.89). CONCLUSIONS: Endoscopic ultrasound-SWE is safe, has a 100% technical success rate, and is highly reproducible when used in the pancreas. Our study suggests that SWE measurements in the HOP offer the highest reliability, likely because of the larger study area and less respiratory artifact.
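A hedged sketch of an intraclass correlation reproducibility check like the one described above, using the open-source pingouin package; treating repeated readings as "raters" and the simulated velocities are illustrative assumptions, not the registry data.

```python
# Sketch: ICC for repeated SWE velocity readings per patient, using simulated values.
import numpy as np
import pandas as pd
import pingouin as pg

rng = np.random.default_rng(0)
patients = np.repeat(np.arange(20), 10)          # 20 hypothetical patients, 10 readings each
readings = np.tile(np.arange(10), 20)            # reading index within each patient
velocity = rng.normal(1.4, 0.2, 20)[patients] + rng.normal(0, 0.05, 200)

df = pd.DataFrame({"patient": patients, "reading": readings, "vs": velocity})
# Long-format data: patients are the targets, repeated readings act as "raters".
icc = pg.intraclass_corr(data=df, targets="patient", raters="reading", ratings="vs")
print(icc[["Type", "ICC", "CI95%"]])
```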


Subject(s)
Elasticity Imaging Techniques , Pancreas , Ultrasonography, Interventional , Humans , Feasibility Studies , Pancreas/diagnostic imaging , Reproducibility of Results
20.
Obes Surg ; 33(6): 1790-1796, 2023 06.
Article in English | MEDLINE | ID: mdl-37106269

ABSTRACT

PURPOSE: ChatGPT is a large language model trained on a large dataset covering a broad range of topics, including the medical literature. We aim to examine its accuracy and reproducibility in answering patient questions regarding bariatric surgery. MATERIALS AND METHODS: Questions were gathered from nationally regarded professional societies and health institutions as well as Facebook support groups. Board-certified bariatric surgeons graded the accuracy and reproducibility of responses. The grading scale included the following: (1) comprehensive, (2) correct but inadequate, (3) some correct and some incorrect, and (4) completely incorrect. Reproducibility was determined by asking the model each question twice and examining the difference in grading category between the two responses. RESULTS: In total, 151 questions related to bariatric surgery were included. The model provided "comprehensive" responses to 131/151 (86.8%) of questions. When examined by category, the model provided "comprehensive" responses to 93.8% of questions related to "efficacy, eligibility and procedure options"; 93.3% related to "preoperative preparation"; 85.3% related to "recovery, risks, and complications"; 88.2% related to "lifestyle changes"; and 66.7% related to "other". The model provided reproducible answers to 137 (90.7%) of questions. CONCLUSION: The large language model ChatGPT often provided accurate and reproducible responses to common questions related to bariatric surgery. ChatGPT may serve as a helpful adjunct information resource for patients regarding bariatric surgery, in addition to the standard of care provided by licensed healthcare professionals. We encourage future studies to examine how to leverage this disruptive technology to improve patient outcomes and quality of life.
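A minimal sketch of the two-pass reproducibility tally described in the methods: each question is graded twice, and a pair counts as reproducible when both grades fall in the same category. The questions and grades shown are hypothetical placeholders, not the study data.

```python
# Sketch: count questions whose two independently graded responses share a grade category.
GRADES = {1: "comprehensive", 2: "correct but inadequate",
          3: "some correct and some incorrect", 4: "completely incorrect"}

# question -> (grade of first response, grade of second response), assigned by reviewers
paired_grades = {
    "Who is eligible for bariatric surgery?": (1, 1),
    "How long is recovery after a sleeve gastrectomy?": (1, 2),
    # ... one entry per question asked twice
}

reproducible = sum(1 for a, b in paired_grades.values() if a == b)
print(f"reproducible: {reproducible}/{len(paired_grades)} "
      f"({reproducible / len(paired_grades):.1%})")
```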


Subject(s)
Bariatric Surgery , Obesity, Morbid , Humans , Quality of Life , Reproducibility of Results , Obesity, Morbid/surgery , Language